Whole-Genome Sequencing of Staphylococcus aureus and Staphylococcus haemolyticus Clinical Isolates from Egypt

ABSTRACT Infections caused by antibiotic-resistant Staphylococcus are a global concern. This is true in the Middle East, where increasingly resistant Staphylococcus aureus and Staphylococcus haemolyticus strains have been detected. While extensive surveys have revealed the prevalence of infections caused by antibiotic-resistant staphylococci in Europe, Asia, and North America, the population structure of antibiotic-resistant staphylococci recovered from patients and clinical settings in Egypt remains uncharacterized. We performed whole-genome sequencing of 56 S. aureus and 10 S. haemolyticus isolates from Alexandria Main University Hospital; 46 of the S. aureus genomes and all 10 of the S. haemolyticus genomes carry mecA, which confers methicillin resistance. Supplemented with additional publicly available genomes from the other parts of the Middle East (34 S. aureus and 6 S. haemolyticus), we present the largest genomic study to date of staphylococcal isolates from the Middle East. These genomes include 20 S. aureus multilocus sequence types (MLST), including 3 new ones. They also include 9 S. haemolyticus MLSTs, including 1 new one. Phylogenomic analyses of each species’ core genome largely mirrored those of the MLSTs, irrespective of geographical origin. The hospital-acquired spa t037/ST239-SCCmec III/MLST CC8 clone represented the largest clade, comprising 22% of the S. aureus isolates. Like S. aureus genome surveys of other regions, these isolates from the Middle East have an open pangenome, a strong indicator of gene exchange of virulence factors and antibiotic resistance genes with other reservoirs. Our genome analyses will inform antibiotic stewardship and infection control plans in the Middle East.

IMPORTANCE Staphylococci are understudied despite their prevalence within the Middle East. Methicillin-resistant Staphylococcus aureus (MRSA) is endemic to hospitals in Egypt, as are other antibiotic-resistant strains of S. aureus and S. haemolyticus. To provide insight into the strains circulating in Egypt, we performed whole-genome sequencing of 56 S. aureus and 10 S. haemolyticus isolates from Alexandria Main University Hospital. Through analysis of these genomes, as well as all available S. aureus and S. haemolyticus genomes from the Middle East (n = 40), we were able to produce a picture of the diversity in this region more complete than those afforded by traditional molecular typing strategies. For example, we identified 4 new MLSTs. Most strains harbored genes associated with multidrug resistance, toxin production, biofilm formation, and immune evasion. These data provide invaluable insight for future antibiotic stewardship and infection control within the Middle East.

Reviewer #2 (Comments for the Author): The research article proposed by Montelongo et al entitled "Phylogenomic study of Staphylococcus aureus and Staphylococcus haemolyticus clinical isolates from Egypt" describes one of the important topics. In general the article is of great impact to readers working in healthcare settings, especially considering that it performs WGS for a large number of isolates; however, some minor issues need to be addressed before considering final publication. Those are shown below.
1-The authors need to add in the introduction section a short paragraph about the clinical outcomes and the PVL-positive S. aureus, especially in Egypt.
2-PVL-positive isolates are mainly accompanied by severe outcomes like brain abscess, so illustrating such a correlation will raise the significance of the work.
3-The authors need to emphasize that the geographical location of Egypt plays an important role in the dissemination of various lineages and types within this country.
4-Please rewrite the part of the open pangenome to be simpler and clearer to the readers.
5-Table 2: Is this the complete list of identified virulence factors? It seems too short.
Reviewer #3 (Comments for the Author): In this manuscript, Staphylococcus aureus and Staphylococcus haemolyticus clinical isolates from Egypt were collected. In total, 56 S. aureus and 10 S. haemolyticus isolates were collected and subjected to WGS. Staphylococci are an important heterogenous group of pathogens which can cause severe infection. The study of antibiotic resistance and virulence for these two strains can have a large impact on clinical research. However, only a minority of S. aureus and S. haemolyticus studies have provided information from the Middle East, which is the connection between Asia and Europe. The authors of this manuscript cooperated with Alexandria Main University Hospital to fill in this gap. All the WGS data, including contigs, scaffolds and annotations, are open for public use. Moreover, the antibiotic-associated and virulence-associated genes also provide useful information for future clinical study. Furthermore, the phylogenomic study revealed the relationship between region and MLST as well.
Based on the manuscript, I have several suggestions and questions:
1. For the reproducibility, the parameters that the authors assigned for the software should be provided. If the default setting was used, it also needs to be written in the manuscript. Otherwise, the reliability of the results may be decreased. Moreover, if some specific cutoffs were applied for the detections or predictions, they also need to be shown in the method section. For example, the criteria for searching the virulence-associated and antibiotic-associated genes listed in supplementary tables 3 and 4 are not provided. The readers do not know how the authors ran the software and how to obtain these results.
2. I would also suggest that the authors upload their scripts or commands for bioinformatic analyses online publicly. It can benefit the reproducibility and provide a guideline for future applications as well.
3. From the S. aureus clinical isolates, several unknown MLSTs were found. It would be great if the authors could show more details about these unknown MLSTs. For example, the unknown MLST in figure 3 is quite close to ST-22; how different are they?
4. The statements at lines 158-160 mentioned that "The genomes represent varied MLSTs. The 16 S. haemolyticus isolates examined here belonged to nine MLSTs, including a new genotype ST-74 (strain 51) assigned as a result of this study, and an isolate of unknown ST (strain 7A)". However, the data shown in figure 2 and figure S1 do not match this statement. In both figure 2 and figure S1, ST-74 is for strain 7A and unknown ST is for strain 51. But the data in supplementary table 2 do match the statement in the manuscript. The authors need to double check the data.
5. In the methods section, lines 292-294, the authors mentioned that "A total of 8 S. aureus and 14 S. haemolyticus consecutive non-duplicate isolates were collected from the Medical Microbiology Laboratory at Alexandria Main University Hospital (AMUH) between September and December 2015." However, the results and whole descriptions in the manuscript are based on 56 S. aureus and 10 S. haemolyticus isolates. Why did the authors only use part of the collection for the analyses? Perhaps I missed it, but I did not find the reason.
6. Based on geographical location, the Middle East is the bridge between Asia and Europe. It would be interesting to see the phylogenomic analysis of the strains from Asia, the Middle East and Europe. The spread and evolution of staphylococci could then be understood. If the authors can provide such information, the value of this study can be increased.
The article "Phylogenomic study of Staphylococcus aureus and Staphylococcus haemolyticus clinical isolates from Egypt" gives a distinct look at the two most pathogenic Staphylococcus species and types/clades of these species that are circulating in the Middle Eastern region, mainly Egypt. Additionally, the authors isolate 4 newly identified, distinct genotype strains based on their analyses. While this is certainly valuable information, particularly because the strains were acquired from active hospital infections, some of the importance and novelty of this study is lost in the type/clade jargon that is used without any indication of why these distinct types/clades are different or important to acknowledge. The manuscript could be improved by keeping a broader audience in mind who may be interested in phylogenetics or antibiotic resistance without the specific knowledge of jargon used in the S. aureus field in relation to strain types/clades. Two of the 4 figures in the manuscript refer to these types/clades, so a more thorough introduction and discussion would be very beneficial to comprehending the importance of the manuscript.

Editorial comments:
While I think the backbone of this paper is solid, it does need major revisions to make it understandable and digestible to a broader reading audience. The type/clade jargon could use some major explaining in the introduction and/or discussion to become understandable, and more statistical analyses should be done to support some of the results.
We recognize that there is a lot of jargon surrounding Staphylococcus, including the multiple different methods for typing strains for global epidemiological studies. We have reviewed our manuscript and added context for the typing jargon so that a broader audience can better understand and appreciate the results presented. Furthermore, we agree that it was difficult to interpret some of the results and discussion surrounding the clades. We have revised Figure 3 to include a visualization of the clades. We have also added the following to the introduction (Lines 96-99): "In that respect, WGS can be used to identify outbreak clones or clades, which are a group of independent isolates that share phenotypic and genotypic traits, most likely have a common ancestor, and form a branch on a phylogenetic tree (17-19)".
As it currently stands, the paper is just a descriptive study of sequences. Adding relevant elements to the discussion would enhance the scientific impact. Please also make sure that the findings support the conclusions.
Due to the dearth of S. aureus and S. haemolyticus sequence data from the Arab region, the conclusions that can be drawn from our analysis have limitations. We have made this clear to the reader. Nevertheless, it is an important study for: (1) understanding these two important pathogens in the region, and (2) laying the foundation for future global studies that investigate the role that this region, and in particular Egypt, plays in the transmission of staphylococci endemic to Asia and staphylococci endemic to Europe.
Since the study involves bioinformatic analyses, the parameters (details) should be included in the manuscript to provide relevant information on how the analyses were performed. Different parameter sets may influence the results and internal statistical tests significantly. I would strongly recommend providing such information or uploading it online for reproducibility.
We recognize that there may have been some confusion, noted by Reviewer #3, regarding the bioinformatic analyses. No new scripts were created. In the revised manuscript, we have made this clear and have provided a more thorough discussion of the parameters used for every step of our analysis. We have fully addressed these concerns in our response to Reviewer #3.
Reviewer #1 (Comments for the Author):
General comment
In their study, the authors aimed to use genomic data to investigate circulating strains of S. aureus and S. haemolyticus in Egypt. Despite the strength of the genomic data provided, the scientific content, the structure, and the wording of this article need to be significantly improved. English revision should be done for resubmission. Example: Line 74 (the infection it can cause range)
We have rephrased the specific statement indicated here: "S. aureus is arguably the most clinically important staphylococcal species; it can be the cause of mild erythema to serious life-threatening ailments, including septicemia, pneumonia, and endocarditis" (Lines 74-76). We have also reviewed the manuscript for wording.

Specific comments
The text of the Impact statement remains very descriptive and does not raise the impact of this study.
The biggest impact of our study is the simple fact that very few isolates have been sequenced from the Middle East. Our comparative genomic analysis included the 40 S. aureus and S. haemolyticus genomes that were publicly available prior to our study. We produced 56 S. aureus and 10 S. haemolyticus genomes as part of our study. This more than doubled the number of available sequenced isolates from the entire Middle East region. This is certainly an important contribution. The genomic investigation provides a more detailed view of the strains in circulation than traditional molecular typing strategies, which are the source of most of the current data in the Middle East. We have revised the impact statement to more clearly emphasize the significance of our work.
The Introduction should end at line 130. The following text describes the results.
We have made this change.
The results section begins with a paragraph without a subtitle which ultimately describes more of a methodology and very few of the results.
The production of these genomes, which essentially doubles the number of publicly available genomes from Middle Eastern isolates for these two species, is a result. However, we recognize that the value of these genome sequences lies in the comparative analysis. We have restructured the start of the Results such that the first subheading is "Strain genotyping", where we introduce the 66 new strains and their genomes. Moreover, we have focused the introduction of these results. (Lines 142-148)
The discussion, which should be a discussion of the results, relates more elements that have their place in the introduction. The discussion is not hard-hitting and should be better developed.
Thank you for this comment. We have revised the discussion accordingly.
In Methods (Bacterial isolates), the authors speak of 89 S. aureus and 14 S. haemolyticus, which does not correspond to the figures described in the abstract.

Thank you for noting this discrepancy. We have corrected the numbers reported in the methods: "Fifty-six S. aureus and 10 S. haemolyticus consecutive non-duplicate isolates were collected from the Medical Microbiology Laboratory at Alexandria Main University Hospital (AMUH) between September and December 2015." (Lines 302-304)
In this same paragraph, the authors describe poorly performing conventional methods for the identification of the genus Staphylococcus. I would have expected confirmation based on the genomic data in their possession.
Initially, the isolates' species were confirmed using conventional methods. These initial taxonomic classifications were later confirmed using the genomic data. We have made this fact clearer in the methods (Lines 336-339). CheckM was run to assess completeness and contamination of the assemblies. This process was conducted through PATRIC, and the user is required to specify the genus. It is important to note that PATRIC by default ascertains whether this is the right genus for comparison. The results of this first step in genome assembly evaluation confirmed our initial taxonomic designation. It was further confirmed upon submission to NCBI and PGAP annotation, which by default runs analyses to confirm the user-supplied genus and species.
The low number of non-Egyptian genomes in this study cannot allow a prevalence comparison among MLST types. The authors should use these genomic data from another angle.
We agree with the reviewer: the number of genomes from the Middle East is scant in comparison with, e.g., Europe and the USA. With the limited publicly available data, a resource that our contribution here more than doubled, it is not possible at this time to speak of prevalence. However, our study is a very important first step in beginning to explore the diversity of lineages within the Middle East, and more specifically Egypt. We believe it is important that this is emphasized throughout the scientific literature; a global perspective of pathogen prevalence is desperately needed.

Reviewer #2 (Comments for the Author):
The research article proposed by Montelongo et al entitled "Phylogenomic study of Staphylococcus aureus and Staphylococcus haemolyticus clinical isolates from Egypt" describes one of the important topics. In general the article is of great impact to readers working in healthcare settings, especially considering that it performs WGS for a large number of isolates; however, some minor issues need to be addressed before considering final publication. Those are shown below.
1-The authors need to add in the introduction section a short paragraph about the clinical outcomes and the PVL-positive S. aureus, especially in Egypt.
A paragraph was added to the introduction to stress the significance and prevalence of PVL-positive isolates in Egypt.
2-PVL-positive isolates are mainly accompanied by severe outcomes like brain abscess, so illustrating such a correlation will raise the significance of the work.
See response to the prior comment.
3-The authors need to emphasize that the geographical location of Egypt plays an important role in the dissemination of various lineages and types within this country.

First, we have emphasized this point in the introduction: "Because of its central location as well as its political and historical role, Egypt presents a unique case-study for staphylococcal distribution and exchange in the Arab region (33). Furthermore, Egypt's cultural and geographical placement may facilitate local Staphylococcal exposure to international lineages from the Middle East, as well as Asia, Europe, and Africa." (Lines 130-131)
We have emphasized this again in the discussion through our discussion of sequence types (STs) that are prevalent in Europe and Asia and are found in Egypt. (Lines 236-246)
4-Please rewrite the part of the open pangenome to be simpler and clearer to the readers.
We made revisions in the presentation of the pangenome results (beginning Line 174) as well as our discussion of this analysis (Line 254). We have also added detail to the methods regarding how these computations were performed (beginning Line 363).
5-Table 2: Is this the complete list of identified virulence factors? It seems too short.

Reviewer #3 (Comments for the Author):
In this manuscript, Staphylococcus aureus and Staphylococcus haemolyticus clinical isolates from Egypt were collected. In total, 56 S. aureus and 10 S. haemolyticus isolates were collected and subjected to WGS. Staphylococci are an important heterogenous group of pathogens which can cause severe infection. The study of antibiotic resistance and virulence for these two strains can have a large impact on clinical research. However, only a minority of S. aureus and S. haemolyticus studies have provided information from the Middle East, which is the connection between Asia and Europe. The authors of this manuscript cooperated with Alexandria Main University Hospital to fill in this gap. All the WGS data, including contigs, scaffolds and annotations, are open for public use. Moreover, the antibiotic-associated and virulence-associated genes also provide useful information for future clinical study. Furthermore, the phylogenomic study revealed the relationship between region and MLST as well.
Based on the manuscript, I have several suggestions and questions:
1. For the reproducibility, the parameters that the authors assigned for the software should be provided. If the default setting was used, it also needs to be written in the manuscript. Otherwise, the reliability of the results may be decreased. Moreover, if some specific cutoffs were applied for the detections or predictions, they also need to be shown in the method section. For example, the criteria for searching the virulence-associated and antibiotic-associated genes listed in supplementary tables 3 and 4 are not provided. The readers do not know how the authors ran the software and how to obtain these results.
The parameters used for raw read processing and assembly were included in our prior manuscript (beginning Line 330). We neglected to mention that completeness and contamination were assessed via PATRIC during the PATRIC annotation process. We have included this information (Lines 338-339).
We have explicitly listed the parameters (if there were any) for the MLST, SpaTyper, SCCmecFinder, and VFAnalyzer analyses (beginning Line 350).
2. I would also suggest that the authors upload their scripts or commands for bioinformatic analyses online publicly. It can benefit the reproducibility and provide a guideline for future applications as well.
No new scripts were developed as part of this work. We recognize that our reference to the anvi'o "scripts" may have been misleading. These are in fact functions that can be called through the anvi'o environment. For the identification of the single-copy core genome, we did specify that only gene clusters that were conserved among all genomes and occur once per genome were selected; we have now explicitly listed these parameters (beginning Line 363). Unless noted, default parameters for these anvi'o functions were used; anvi'o offers fantastic documentation to support users unfamiliar with these functions. For phylogenomic and phylogenetic tree derivation and visualization, default parameters were used, and we have explicitly stated this in the text (Lines 371-375).
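As an illustration only, the selection rule described above (keep a gene cluster only if it is present in every genome and occurs exactly once in each) can be sketched in a few lines of Python. This is not anvi'o code and does not reproduce the analysis in the manuscript; the function name, isolate names, and cluster identifiers are hypothetical and exist only for this toy example.

```python
from typing import Dict, List


def single_copy_core(
    clusters: Dict[str, Dict[str, int]], genomes: List[str]
) -> List[str]:
    """Return the gene clusters found exactly once in every listed genome."""
    return sorted(
        name
        for name, counts in clusters.items()
        # A genome missing from `counts` is treated as carrying zero copies.
        if all(counts.get(genome, 0) == 1 for genome in genomes)
    )


# Hypothetical toy data: gene cluster -> {genome -> number of gene copies}.
genomes = ["isolate_A", "isolate_B", "isolate_C"]
clusters = {
    "GC_0001": {"isolate_A": 1, "isolate_B": 1, "isolate_C": 1},  # single-copy core
    "GC_0002": {"isolate_A": 2, "isolate_B": 1, "isolate_C": 1},  # duplicated in isolate_A
    "GC_0003": {"isolate_A": 1, "isolate_B": 1},  # absent from isolate_C
}

core = single_copy_core(clusters, genomes)
print(core)
```

In this toy table only GC_0001 satisfies both conditions; GC_0002 is excluded because it has two copies in one genome, and GC_0003 because it is missing from one genome.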
3. From the S. aureus clinical isolates, several unknown MLSTs were found. It would be great if the authors could show more details about these unknown MLSTs. For example, the unknown MLST in figure 3 is quite close to ST-22; how different are they?
6. Based on geographical location, the Middle East is the bridge between Asia and Europe. It would be interesting to see the phylogenomic analysis of the strains from Asia, the Middle East and Europe. The spread and evolution of staphylococci could then be understood. If the authors can provide such information, the value of this study can be increased.
We certainly agree that such an examination would be quite interesting.
The article "Phylogenomic study of Staphylococcus aureus and Staphylococcus haemolyticus clinical isolates from Egypt" gives a distinct look at the two most pathogenic Staphylococcus species and types/clades of these species that are circulating in the Middle Eastern region, mainly Egypt. Additionally, the authors isolate 4 newly identified, distinct genotype strains based on their analyses. While this is certainly valuable information, particularly because the strains were acquired from active hospital infections, some of the importance and novelty of this study is lost in the type/clade jargon that is used without any indication of why these distinct types/clades are different or important to acknowledge. The manuscript could be improved by keeping a broader audience in mind who may be interested in phylogenetics or antibiotic resistance without the specific knowledge of jargon used in the S. aureus field in relation to strain types/clades. Two of the 4 figures in the manuscript refer to these types/clades, so a more thorough introduction and discussion would be very beneficial to comprehending the importance of the manuscript.
We recognize that our prior manuscript made it difficult to ascertain which strains belonged to which clade. We have revised Fig. 3 such that the reader can easily identify the clades within the tree. This enables easier interpretation of the discussion of these clades. We have also added this part to the introduction (Lines 96-99) "In that respect, WGS can be used to identify outbreak clones or clades, which are a group of independent isolates that share phenotypic and genotypic traits, most likely have a common ancestor, and form a branch on a phylogenetic tree (17-19)".
Staff Comments:

Preparing Revision Guidelines
To submit your modified manuscript, log onto the eJP submission site at https://spectrum.msubmit.net/cgi-bin/main.plex. Go to Author Tasks and click the appropriate manuscript title to begin the revision process. The information that you entered when you first submitted the paper will be displayed. Please update the information as necessary.
Here are a few examples of required updates that authors must address:
• Point-by-point responses to the issues raised by the reviewers in a file named "Response to Reviewers," NOT IN YOUR COVER LETTER.
• Upload a compare copy of the manuscript (without figures) as a "Marked-Up Manuscript" file.
• Each figure must be uploaded as a separate file, and any multipanel figures must be assembled into one file.
For complete guidelines on revision requirements, please see the journal Submission and Review Process requirements at https://journals.asm.org/journal/Spectrum/submission-review-process. Submissions of a paper that does not conform to Microbiology Spectrum guidelines will delay acceptance of your manuscript.
Please return the manuscript within 60 days; if you cannot complete the modification within this time period, please contact me. If you do not wish to modify the manuscript and prefer to submit it to another journal, please notify me of your decision immediately so that the manuscript may be formally withdrawn from consideration by Microbiology Spectrum.
If your manuscript is accepted for publication, you will be contacted separately about payment when the proofs are issued; please follow the instructions in that e-mail. Arrangements for payment must be made before your article is published. For a complete list of Publication Fees, including supplemental material costs, please visit our website.